1. INTRODUCTION



One of the most important features of any computer is its arithmetic. This document
discusses the implementation of floating point arithmetic in the Burroughs
Scientific Processor (BSP). Data representation in both the BSP memory and 
arithmetic element is described, as are the arithmetic algorithms used in the 
BSP. Of particular interest are the techniques used for error checking in the 
arithmetic element and for rounding in both the scalar processor and the parallel 
processor. The BSP arithmetic operations, including instructions and cycle 
operations, are described in detail in Appendix A, and the accuracy of arithmetic 
operations is discussed in Appendix B. 
2. DATA REPRESENTATION IN MEMORY



Data in the memory of the Burroughs Scientific Processor (BSP) is represented
in one of three formats:

1. Single precision floating point word format,

2. Integer word format,

3. Double precision real floating point word format.

SINGLE PRECISION FLOATING POINT WORD FORMAT

A single precision floating point number, X, is represented by an ordered pair of
numbers, E (exponent) and m (mantissa), such that:

    $X = 2^{E} \cdot m$

where E is an integer and m satisfies the condition:

    $-1 < m \le -1/2$ or $1/2 \le m < 1$ or $m = 0$.

In order to meet this condition, that is, $1/2 \le |m| < 1$, the last step in every
floating point operation is the normalization of the mantissa (removal of leading
zeroes). The layout of the floating point word in memory is indicated below. The
bits are numbered from right to left. The least significant bit is numbered 0; the
most significant bit of the mantissa is bit 35; the least significant bit of the
exponent is bit 36. The most significant bit of the exponent is bit 45. Bit 46 is
the sign bit of the mantissa; bit 47 is the sign bit of the exponent.

Every group of consecutive bits is called a field, and is denoted by W[x:y], where
W is the name of the data unit, x is the address of the left-most bit, and y is the
length of the field. Thus, a data word of the data unit, A, is defined as A[47:48],
and its null indicator is defined as A[48:1].

Using field notation X[leading bit:number of bits], the mantissa is represented
by X[35:36], the exponent by X[45:10], the sign bit of the mantissa by X[46:1],
and the sign bit of the exponent by X[47:1].



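     47    46    45 ........ 36    35 ................ 0
    +----+----+----------------+------------------------+
    |    |    |    Exponent    |        Mantissa        |
    +----+----+----------------+------------------------+
      |    |
      |    +--> Sign of Mantissa
      +-------> Sign of Exponent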
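The field notation and word layout can be illustrated with a short Python sketch. It assumes that both the exponent and the mantissa are stored as sign plus magnitude and that the mantissa field is a binary fraction scaled by 2^-36; the text implies this but does not state it outright. The names `field` and `decode_single` are hypothetical helpers, not BSP terminology.

```python
from fractions import Fraction


def field(word: int, left: int, length: int) -> int:
    """Return W[left:length]: `length` bits whose left-most bit is bit `left`."""
    return (word >> (left - length + 1)) & ((1 << length) - 1)


def decode_single(word: int) -> Fraction:
    """Decode a 48-bit word (assumed sign-magnitude exponent and mantissa)."""
    mantissa = Fraction(field(word, 35, 36), 1 << 36)  # fraction in [0, 1)
    exponent = field(word, 45, 10)                     # magnitude 0..1023
    if field(word, 47, 1):                             # sign of exponent
        exponent = -exponent
    if field(word, 46, 1):                             # sign of mantissa
        mantissa = -mantissa
    return mantissa * Fraction(2) ** exponent


# 1.0 = 2^1 * 0.5: exponent field is 1, leading mantissa bit (bit 35) is set.
one = (1 << 36) | (1 << 35)
print(decode_single(one))  # 1
```

Exact rational arithmetic is used here only so that the extreme exponents do not depend on host floating point behaviour.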



The range of representable numbers in single precision is as follows: 

1. For positive X:

   $2^{-1023} \cdot 1/2 \le X \le 2^{1023} \cdot (1 - 2^{-36})$, where

   $2^{-1023} \cdot 1/2 \approx 10^{-308.25}$

2. For negative X:

   $-2^{-1023} \cdot 1/2 \ge X \ge -2^{1023} \cdot (1 - 2^{-36})$, where

   $2^{1023} \cdot (1 - 2^{-36}) \approx 10^{307.95}$.
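The decimal exponents quoted for the single precision range can be checked directly. The following is a minimal sketch using only base-10 logarithms of the two limits; the variable names are illustrative.

```python
import math

# log10 of the smallest positive magnitude, 2**-1023 * 1/2 = 2**-1024
smallest = -1024 * math.log10(2)

# log10 of the largest magnitude, 2**1023 * (1 - 2**-36)
largest = 1023 * math.log10(2) + math.log10(1 - 2.0 ** -36)

print(round(smallest, 2), round(largest, 2))  # -308.25 307.95
```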



INTEGER WORD FORMAT 



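An integer, I, is defined by its absolute value m(I) and by its sign bit S(I). Field
I[35:36] contains the magnitude, and the sign bit is in field I[46:1]. The unused bits
of the data word are set to 0. The range of integer values is symmetric about zero:

    $-2^{36} + 1 \le I \le 2^{36} - 1$

     47    46    45 ........ 36    35 ................ 0
    +----+----+----------------+------------------------+
    |  0 |    |    0 ... 0     |        Integer         |
    +----+----+----------------+------------------------+
           |
           +--> Sign of Integer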

DOUBLE PRECISION REAL FLOATING POINT WORD FORMAT

A double precision floating point number X is represented by two single precision
numbers FIRST(X) and SECOND(X). Both of these numbers are normalized. The
mantissa sign bits must be the same. Due to normalization, the relationship
between exponents is:

    EXPONENT(SECOND(X)) $\le$ EXPONENT(FIRST(X)) - 36

SECOND(X) = 0 is a valid word.

Thus, a double precision floating point word is the sum D(X) = FIRST (X) + 
SECOND (X). The BSP double precision format is different from the B 7800 
double precision format. In the B 7800, the exponents and the mantissas of the 
two single precision words are concatenated, and the two words form one entity. 

The range of a double precision number X is given as follows: 

1. For positive X:

   $2^{-1023} \cdot 1/2 \le X \le 2^{1023} \cdot (1 - 2^{-36}) + 2^{987} \cdot (1 - 2^{-36})$

2. For negative X:

   $-[2^{1023} \cdot (1 - 2^{-36}) + 2^{987} \cdot (1 - 2^{-36})] \le X \le -2^{-1023} \cdot (1/2)$

The double precision format of the BSP, specifically the fact that the exponent of 
the second word is less than or equal to the exponent of the first word minus 36, 
can lead to an underflow condition in the second word, while there is no underflow 
in the first word. For example: 

    $[2^{-951} \cdot (1 - 2^{-36}) + 2^{-1023} \cdot (1 - 2^{-36})] \cdot 1/2$

The multiplication of the number in brackets by 1/2 will lead to underflow in the 
second word. However, the underflow condition can be disabled. 
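The underflow case can be reproduced with a small Python sketch that models a double precision value as the sum of two normalized words with 36-bit mantissas. The helpers `normalize` and `split_double` are hypothetical, truncation is assumed when forming each word, and exact rationals stand in for the hardware registers.

```python
from fractions import Fraction

MIN_EXP = -1023      # smallest exponent a single word can hold
MANT_BITS = 36


def normalize(x: Fraction):
    """Return (E, m) with x = 2**E * m and 1/2 <= |m| < 1 (x must be nonzero)."""
    e = 0
    while abs(x) >= 1:
        x /= 2
        e += 1
    while abs(x) < Fraction(1, 2):
        x *= 2
        e -= 1
    return e, x


def to_word(x: Fraction) -> Fraction:
    """Truncate x to a normalized value with a 36-bit mantissa."""
    e, m = normalize(x)
    return Fraction(int(m * 2**MANT_BITS), 2**MANT_BITS) * Fraction(2) ** e


def split_double(x: Fraction):
    """Split x into FIRST + SECOND (SECOND may be zero)."""
    first = to_word(x)
    rest = x - first
    second = to_word(rest) if rest != 0 else Fraction(0)
    return first, second


# The example from the text: the bracketed number multiplied by 1/2.
x = (Fraction(2) ** -951 + Fraction(2) ** -1023) * (1 - Fraction(1, 2**36))
first, second = split_double(x / 2)
e_second = normalize(second)[0]
print(e_second, e_second < MIN_EXP)  # -1024 True
```

The first word keeps exponent -952, which is representable, while the second word would need exponent -1024, one below the smallest exponent the format allows.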
3. DATA REPRESENTATION IN THE ARITHMETIC ELEMENTS



BASIC DATA REPRESENTATION 

The basic BSP data word consists of 48 binary digits (bits). To this word are 
appended 12 other bits to form a data unit: a one-bit null indicator, three error 
condition bits (underflow, overflow, and undefined), four guard bits, two modulo 3 
residue code bits for the mantissa, and two modulo 3 residue code bits for the 
exponent. For input and output purposes, only 49 bits are used: the 48-bit data 
word, and the null indicator. The positions or addresses of binary digits within 
the data unit are designated by the decimal numbers 0 to 59. Within the data 
word, the bits are numbered 0 to 47. The data format in the arithmetic unit is 
given in the following diagram: 



    Bit(s)      Field
    ------      -----
    47          Exponent sign
    46          Sign of mantissa
    45-36       Exponent
    35-0        Mantissa
    49-52       Guard bits
    53          Underflow
    54          Overflow
    55          Undefined
    48          Null
    56-57       Mantissa residue
    58-59       Exponent residue

REPRESENTATION OF ZERO

Integer, single precision, and double precision zero are represented with all bits
set to zero. So-called dirty zeroes are eliminated by hardware action. Operations
which may result in dirty zeroes, for example (0) * (-5) or N - N, are tested. When
a zero mantissa is found, sign bits and exponent are also set to zero.



COMPLEX NUMBER (SINGLE PRECISION)

Complex numbers are implemented implicitly. Two contiguous, single precision 
floating point numbers represent a complex number Z, where the first number is 
the real part of Z and the second number is the imaginary part of Z. No double 
precision complex numbers are provided. Single precision complex operations 
are implemented by software using real arithmetic hardware operators. 
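As an illustration of what "implemented by software using real arithmetic operators" means, the sketch below multiplies two complex numbers held as (real, imaginary) pairs using only real multiplications, additions, and subtractions. The function name `complex_mul` is hypothetical and the code is not taken from BSP software.

```python
def complex_mul(a, b):
    """Multiply complex numbers stored as (real, imaginary) pairs.

    Only real multiplications, one subtraction, and one addition are used.
    """
    ar, ai = a
    br, bi = b
    return (ar * br - ai * bi, ar * bi + ai * br)


print(complex_mul((1.0, 2.0), (3.0, 4.0)))  # (-5.0, 10.0)
```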






BSP - BURROUGHS SCIENTIFIC PROCESSOR



4. HARDWARE ERROR CHECKING 



Error checking in the arithmetic element is done in three ways:

1. Errors which result in overflow, underflow, or undefined
   operations within arithmetic elements are flagged by the
   corresponding error condition bits of the data unit.

2. Errors within the arithmetic elements are checked by a
   modulo 3 residue code. Exponent and mantissa are checked
   separately (two bits each). A modulo 3 residue code is
   limited; it cannot detect errors which are multiples of three.
   A modulo 3 error results in an interrupt. Errors are reported
   for logging and to the parallel memory control to disable
   parallel memory write.

3. Errors in data transmission. A Hamming Code generator
   computes seven parity bits of a Hamming Code over the 48
   bits of data. This code is a single error correction/double
   error detection code (SEC/DED), which protects the parallel
   memory and the data transmission of the alignment networks.
   Input data from the arithmetic elements are encoded by the
   alignment networks; input data from the control processor
   and from file memory are already encoded.
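The behaviour of a modulo 3 residue check, including its blind spot, can be sketched in a few lines. How the BSP hardware actually forms and compares the residues is not described in this document, so the functions below (`residue3`, `check_product`) are only an assumed illustration of the principle for an integer product.

```python
def residue3(value: int) -> int:
    """Two-bit modulo 3 residue of an integer field."""
    return value % 3


def check_product(a: int, b: int, result: int) -> bool:
    """True if `result` is consistent with a * b under the residue check."""
    return residue3(result) == (residue3(a) * residue3(b)) % 3


a, b = 0b101101, 0b110011
good = a * b

print(check_product(a, b, good))             # True
print(check_product(a, b, good ^ (1 << 4)))  # False: a one-bit flip changes the value by 2**4
print(check_product(a, b, good + 3))         # True: an error of 3 escapes detection
```

A single flipped bit always changes the value by a power of two, which is never a multiple of three, so it is always caught; errors that are multiples of three are the limitation the text mentions.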

Detected errors are logged in the maintenance log. The Hamming Code will
report bit error, arithmetic element numbers and error type. There are three
classes of errors:

1. Tolerable error (e.g., Hamming Code single bit error),

2. Fatal error (e.g., memory address error during write cycle),

3. Retryable error (e.g., double bit Hamming Code error, that is,
   conditional, overflow, underflow, undefined).

These three classes of errors are reported as interrupts to the Master Control
Program (MCP). It is up to the MCP to determine the appropriate action.

Fatal and retryable errors will inhibit a write cycle. Depending on error type,
a retry or a system shutdown is initiated.

5. ARITHMETIC ALGORITHMS



IMPLEMENTATION OF RECIPROCATION AND SQUARE ROOT 

The BSP implements reciprocation and the square root by iterative procedures. Both
algorithms are applications of the Newton-Raphson method that utilize
multiplications and additions only. Both algorithms are partially hardware-implemented.

The Newton-Raphson procedure has quadratic convergence, and the accuracy which
can be achieved depends on the choice of the starting value and on the number of
iterations.

In order to obtain the machine accuracy of 36 bits in reciprocation and in the square
root, the design requires that this accuracy be reached in three iterations.

RECIPROCATION 

To obtain the reciprocal of a floating point number A, Newton's method is used
to solve the equation:

    $F(X) = 1/X - A = 0$

The first step is to find the initial approximation $X_0$. This is done via a table
lookup from a ROM. Recall that the internal representation of A is (E, m),
where E is the exponent (base 2) and m is the normalized mantissa, $1/2 \le m < 1$.
Thus, $1/A = 2^{-E} \cdot 1/m$ and reciprocation is reduced to the computation of 1/m
in the interval (1/2, 1). The function 1/m in the interval (1/2, 1) is shown below:

[Figure: plot of 1/m over the interval $1/2 \le m < 1$, where $2 \ge 1/m > 1$]

Over the interval (1/2, 1), the function 1/m runs from 2 down to 1. Since it is
required that division and the square root operations be accurate to 36 bits within
three iterations, the starting value for the Newton-Raphson iteration has to be
accurate to at least five bits, assuming quadratic convergence. However, since the
slope of the curve is not constant, one also has to consider the maximum slope of
the curve. The slope of 1/m is $-1/m^2$; its magnitude is largest at m = 1/2, where
the slope is -4. Therefore, an additional two bits are required to achieve the
required accuracy in the neighborhood of m = 1/2. For practical reasons, a ROM of
eight bits is used, which corresponds to a table of 256 entries.

When $X_0$ has been determined from ROM, the successive approximations are formed
using the recursion:

    $X_{n+1} = X_n \cdot (2 - A \cdot X_n)$

To get the iteration started, seven bits are taken from ROM. The first iteration
is accurate to 13(14) bits; the second iteration yields 25(26) bits. These 25 bits
are truncated to 19 bits; the last iteration results in 38 bits. The last two bits
are used as guard bits. In the present application, the algorithm always approaches
the true value from below, and hence, the rounding is not unbiased. The rounding
error is less than or equal to $2^{-36}$. If A = 0, the result is undefined; if A is
so small that its reciprocal cannot be represented, the overflow flag is set.

Double precision reciprocation requires an additional Newton-Raphson iteration.
All operations are executed in double precision. The double precision
reciprocation is accurate to 70 bits.
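The reciprocation scheme can be sketched in Python as a table seed followed by three Newton-Raphson steps. This is an illustration under stated assumptions, not the BSP microcode: the seed function `initial_guess` stands in for the ROM (its contents are not given in this document), host double precision replaces the 36-bit datapath, and the intermediate truncation to 19 bits is not modelled.

```python
import math

TABLE_BITS = 7  # assumed index width, matching the seven bits taken from ROM


def initial_guess(m: float) -> float:
    """ROM stand-in: reciprocal of the midpoint of m's table interval in [1/2, 1)."""
    step = 0.5 / (1 << TABLE_BITS)
    index = int((m - 0.5) / step)
    return 1.0 / (0.5 + (index + 0.5) * step)


def reciprocal(a: float, iterations: int = 3) -> float:
    """Approximate 1/a with X_{n+1} = X_n * (2 - A * X_n)."""
    if a == 0.0:
        raise ZeroDivisionError("reciprocal of zero is undefined")
    m, e = math.frexp(a)            # a = m * 2**e with 1/2 <= |m| < 1
    sign = -1.0 if m < 0 else 1.0
    m = abs(m)
    x = initial_guess(m)
    for _ in range(iterations):
        x = x * (2.0 - m * x)       # each step roughly doubles the correct bits
    return sign * math.ldexp(x, -e)  # 1/A = 2**-E * 1/m


for a in (3.0, 0.7, -12.5):
    print(a, abs(reciprocal(a) * a - 1.0) < 2.0 ** -36)
```

Because 1 - m * X_{n+1} = (1 - m * X_n)^2, every iterate after the first lies at or below the true reciprocal, which matches the remark that the true value is approached from below.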

DIVISION

If A and B are single precision floating point numbers, A/B is found in two steps: 

1. 1/B is found by reciprocation 

2. R = A * (1/B). 

The maximum error in floating point division is:

    $A \cdot (1/B)(1 + E_1) \cdot (1 + E_2) \approx A/B \cdot (1 + E_1 + E_2)$

where:

    $|E_1|, |E_2| \le 2^{-36}$ and

    $|E_1 + E_2| \le 2 \cdot 2^{-36}$
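The approximation in the division error estimate can be made explicit. In the sketch below, $E_1$ is taken to be the relative error of the reciprocation step and $E_2$ that of the final multiplication; this assignment of subscripts is an assumption, since the text only bounds each error by $2^{-36}$.

```latex
\begin{aligned}
A \cdot \tfrac{1}{B}\,(1 + E_1)(1 + E_2) &= \tfrac{A}{B}\,(1 + E_1 + E_2 + E_1 E_2), \\
|E_1 E_2| &\le 2^{-72}, \\
|E_1 + E_2 + E_1 E_2| &\le 2 \cdot 2^{-36} + 2^{-72}.
\end{aligned}
```

The cross term $E_1 E_2$ is 36 binary orders of magnitude smaller than the linear terms, which is why it is dropped from the estimate.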

If A and B are double precision floating point numbers, A/B is found in two steps: 

1. 1/B is found by reciprocation 

2. R = A * (1/B). 

Double precision division is accurate to 70 bits. 

SQUARE ROOT 

If A is a single precision floating point number, the reciprocal square root
$1/\sqrt{A}$ is found by solving the equation:

    $F(X) = 1/X^2 - A = 0$.

The representation of A is $2^{E} \cdot m$. Therefore,
$1/\sqrt{A} = 1/\sqrt{2^{E} \cdot m} = 2^{-E/2} \cdot 1/\sqrt{m}$.

If the exponent, E, is odd, $2^{E}$ is multiplied by two to make it even. The mantissa
is multiplied by 1/2. The range of the mantissa is thus changed to $1/4 \le m < 1$.

As in division, the initial approximation $X_0$ of $1/\sqrt{m}$ is read from a ROM.
Subsequent iterations are obtained by solving:

    $X_{n+1} = X_n \cdot (3 - X_n^2 \cdot A)/2$.

Seven bits suffice to start the iteration. The first iteration yields 13(14) bits,
the second iteration generates 25(26) bits which are truncated to 19. In the last
step, 38 bits are generated: 36 bits plus 2 bits for rounding. The rounding is
biased, because the true value of $1/\sqrt{A}$ is approached from below. The error in
the reciprocal square root is estimated as $E \le 2 \cdot 2^{-36}$. The square root
itself requires an additional multiplication. The total error in the square root
is, therefore, $E \le 2 \cdot 2^{-36}$.

The double precision reciprocal square root requires an additional iteration. In 
the last iteration, all operations are executed in double precision. 

The square root of a single precision number is found in two steps: 

1. Y = $1/\sqrt{A}$

2. R = A * Y. 

To compute the double precision square root, steps 1 and 2 above are done with 
a double precision reciprocal square root computed first, followed by a double 
precision multiplication. 
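The reciprocal square root iteration and the two-step square root can be sketched as follows. As with the reciprocal, this is an illustration under assumptions: `rsqrt_seed` is a hypothetical stand-in for the ROM (it calls `math.sqrt` only to generate table values), the table size of 256 entries is assumed, and host double precision replaces the 36-bit datapath.

```python
import math

ROM_ENTRIES = 256  # assumed table size; the actual ROM contents are not given


def rsqrt_seed(m: float) -> float:
    """ROM stand-in: 1/sqrt at the midpoint of m's table interval in [1/4, 1)."""
    step = 0.75 / ROM_ENTRIES
    index = int((m - 0.25) / step)
    return 1.0 / math.sqrt(0.25 + (index + 0.5) * step)


def reciprocal_sqrt(a: float, iterations: int = 3) -> float:
    """Approximate 1/sqrt(a) with X_{n+1} = X_n * (3 - X_n^2 * A) / 2."""
    if a <= 0.0:
        raise ValueError("a must be positive")
    m, e = math.frexp(a)                 # a = m * 2**e with 1/2 <= m < 1
    if e % 2:                            # odd exponent: make it even ...
        m, e = m / 2.0, e + 1            # ... and halve the mantissa: 1/4 <= m < 1
    x = rsqrt_seed(m)
    for _ in range(iterations):
        x = x * (3.0 - x * x * m) / 2.0
    return math.ldexp(x, -(e // 2))      # 1/sqrt(A) = 2**(-E/2) * 1/sqrt(m)


def square_root(a: float) -> float:
    """Two-step square root: Y = 1/sqrt(A), then R = A * Y."""
    return a * reciprocal_sqrt(a)


for a in (2.0, 5.0, 0.3):
    print(a, abs(square_root(a) / math.sqrt(a) - 1.0) < 2 * 2.0 ** -36)
```

Only multiplications, subtractions, and a halving appear in the loop, which is the property the text attributes to both Newton-Raphson algorithms.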

6. ROUNDING AND NORMALIZATION



In the control processor, rounding is done after each binary floating point
operation as well as after reciprocation and after reciprocal square root.

In the parallel processor, rounding is done at the conclusion of each binary
floating point operation and after reciprocation and reciprocal square roots. This
type of rounding applies also to the execution of templates such as triads, tetrads,
etc. The rounding operations correspond to the rounding operations in a sequential
machine, although in some instances the sequence of operations may be changed.

ROUNDING - SINGLE PRECISION 

The rounding rules in arithmetic operations (addition, subtraction, multiplication,
reciprocation and reciprocal square root) are:

1. If the discarded portion of the binary fraction is less than half of 
the least significant bit, leave the least significant bit unchanged. 

2. If the discarded portion is greater than half of the least significant 
bit, add one to the least significant bit with full carry propagation. 

3. If the discarded portion is exactly half of the least significant bit, 
set the least significant bit. 

The number of bits used for rounding depends upon the operation. Addition and
subtraction use four rounding bits; multiplication uses 18 bits; reciprocation
and reciprocal square root use two bits.

The following examples demonstrate the rounding rules for addition, using a
mantissa of four bits and two guard bits. Rounding for the other rounding
operations follows the same rules, except that the number of guard bits varies as
stated previously.

Example 1a: 1.0 + 15/16 = 31/16

                       2^1 * (.1000)
                     + 2^0 * (.1111)

                       2^1 * (.1000)
                     + 2^1 * (.0111) 1

                       2^1 * (.1111) 1
    after rounding     2^1 * (.1111)

Example 1b: 9/8 + 11/16 = 29/16

                       2^1 * (.1001)
                     + 2^0 * (.1011)

                       2^1 * (.1001)
                     + 2^1 * (.0101) 1

                       2^1 * (.1110) 1
    after rounding     2^1 * (.1111)

Example 2: 30/16 + 15/16 = 45/16

                       2^1 * (.1111)
                     + 2^0 * (.1111)

                       2^1 * (.1111)
                     + 2^1 * (.0111) 1

                       2^1 * (1.0110) 1
                       2^2 * (.1011) 01
    after rounding     2^2 * (.1011)

Example 3: 40/16 + 15/16 = 55/16

                       2^2 * (.1010)
                     + 2^0 * (.1111)

                       2^2 * (.1010)
                     + 2^2 * (.0011) 11

                       2^2 * (.1101) 11
    after rounding     2^2 * (.1110)
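The three rounding rules can be sketched in Python and checked against the four-bit examples of this section. The function `round_mantissa` is a hypothetical helper; it takes the mantissa with its guard bits appended as one integer, and the renormalization on carry-out is an assumption about how a carry out of the mantissa would be handled.

```python
def round_mantissa(value: int, guard_bits: int, mantissa_bits: int = 4):
    """Round a mantissa that carries `guard_bits` extra low-order bits.

    Returns (mantissa, exponent_adjust).
    """
    kept = value >> guard_bits
    discarded = value & ((1 << guard_bits) - 1)
    half = 1 << (guard_bits - 1)
    if discarded > half:
        kept += 1                 # rule 2: add one, with full carry propagation
    elif discarded == half:
        kept |= 1                 # rule 3: set the least significant bit
    # rule 1 (discarded < half): leave the least significant bit unchanged
    if kept >> mantissa_bits:     # carry out of the mantissa: renormalize
        return kept >> 1, 1
    return kept, 0


# Mantissa bits followed by two guard bits, as in the examples.
print(format(round_mantissa(0b1111_10, 2)[0], "04b"))  # 1111  (Example 1a)
print(format(round_mantissa(0b1110_10, 2)[0], "04b"))  # 1111  (Example 1b)
print(format(round_mantissa(0b1011_01, 2)[0], "04b"))  # 1011  (Example 2)
print(format(round_mantissa(0b1101_11, 2)[0], "04b"))  # 1110  (Example 3)
```

Setting the least significant bit on an exact tie, rather than always rounding up, avoids a carry chain in that case.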

ROUNDING - DOUBLE PRECISION

Double precision operations are not rounded: there are no guard bits.

NORMALIZATION

All real numbers are stored in normalized form, that is, the leading bit of the
mantissa is always a one:

    $1/2 \le \text{mantissa} < 1$

In double precision, both words are normalized.

APPENDIX A
ARITHMETIC OPERATIONS

Instruction                                   Mnemonic    Time in Clocks

Single Precision, Floating Point Family

Add                                           ADD         2
Subtract                                      SUB         2
Multiply                                      MUL         2
Reciprocate                                   RECIP       6
Divide                                        DIV         8
Reciprocal Square Root                        SQRTR       10
Square Root                                   SQRT        12
Square                                        SQR         2
Extract Exponent                              EXTX        2
Insert Exponent                               INSX        2
Binary Scale to Left                          BSCL        2*
Binary Scale to Right                         BSCR        2*
Normalize                                     NORM        2*
Truncated Add                                 TADD        2
Truncated Subtract                            TSUB        2
Truncated Multiply                            TMUL        2
Maximum                                       MAX         2*
Minimum                                       MIN         2*
Absolute Maximum                              AMAX        2
Absolute Minimum                              AMIN        2

*These operations can be performed in one clock if they are incorporated in the
appropriate template.

Instruction                                   Mnemonic    Time in Clocks

Extended Precision Family
(Single Precision Operand, Double Precision Result)

Extended Add                                  EADD        5
Extended Subtract                             ESUB        5
Extended Multiply                             EMUL        4

Double Precision Family

Double Precision Add                          DPADD       8
Double Precision Subtract                     DPSUB       8
Double Precision Multiply                     DPMUL       11
Double Precision Reciprocate                  DPREC       16
Double Precision Divide                       DPDIV       27
Double Precision Reciprocal Square Root       DPSQRR      21
Double Precision Square Root                  DPSQR       32
Double Precision Round to Single Precision    SNGL        2

Integer Family

Float Integer                                 FLOAT       2*
Integer Add                                   IADD        2*
Integer Subtract                              ISUB        2*
Integer Multiply                              IMUL        3
Integer Divide                                IDIV        15

Type Transfer Operations

Integer with Truncation                       FIXT        2*
Integer with Rounding                         FIXR        2*
Integer with Floor                            FIXF        2
Integer with Ceiling                          FIXC        2
Integer to Floating Point                     FLOAT       2*
Normalize                                     NORM        2*
Double Precision to Single Precision          SNGL        2

*These operations can be performed in one clock if they are incorporated in the
appropriate template.

Cycle Operations

ADD/SUBTRACT                            Single Precision

1. A ± B -> R_G
2. NORM(R_G) -> Z                       (Round)

MULTIPLY                                Single Precision

1. A * LSH(B) -> PP
2. A * MSH(B) + PP -> Z

LSH(B) = Least Significant Half (B)
MSH(B) = Most Significant Half (B)
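The two-cycle multiply can be illustrated on integer mantissas. The sketch assumes 36-bit mantissas split into 18-bit halves and assumes that the second partial product is weighted by 2^18 before the first is added; the document lists the two cycles but does not spell out the alignment. The function name `multiply_mantissas` is hypothetical.

```python
HALF = 18  # assumed half width of a 36-bit mantissa


def multiply_mantissas(a: int, b: int) -> int:
    """Form a * b from two half-width partial products."""
    lsh = b & ((1 << HALF) - 1)        # LSH(B): least significant half
    msh = b >> HALF                    # MSH(B): most significant half
    pp = a * lsh                       # cycle 1: A * LSH(B) -> PP
    return ((a * msh) << HALF) + pp    # cycle 2: A * MSH(B) + PP -> Z


a, b = 0xABCDEF123, 0x987654321       # two 36-bit operands
print(multiply_mantissas(a, b) == a * b)  # True
```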



RECIP (A)                               Single Precision

1. 2 - (A * PROM(A)) -> P_1             Table look-up (7 bits)
2. P_1 * PROM(A) -> R_1
3. 2 - (A * R_1) -> P_2
4. P_2 * R_1 -> R_2
5. 2 - (A * R_2) -> P_3
6. P_3 * R_2 -> Z                       (Round)

DIV                                     Single Precision

1-6. RECIP(B) -> C                      (Round)
7-8. A MULT C -> Z                      (Round)



SQRTR (A)                               Single Precision

1.  A shift if exp. odd -> A'
2.  A' * PROM(A') -> P_1
3.  3 - P_1 * PROM(A') -> P_2
4.  P_2 * PROM(A') -> R_1
5.  A' * R_1 -> P_3
6.  3 - P_3 * R_1 -> P_4
7.  P_4 * R_1 -> R_2
8.  A' * R_2 -> P_5
9.  3 - P_5 * R_2 -> P_6
10. P_6 * R_2 -> Z



DPSQRR    Double Precision Reciprocal Square Root

1-10.  SQRTR(A_1) -> B_1 & A_1 shift if exp odd -> A_1'
11.    A_1 + A_2 -> A* (A* has A_1 exponent)
12.    A* -> A* (pass to R_0)
13.    A* shift if exp odd -> A**
14-16. B_1 * (A_1' + A**)
17.    -> C_1 + C_2
18-20. 3 - B_1 * (C_1 + C_2)
21.    -> D_1 + D_2
22-24. B_1 * (D_1 + D_2)
25.    -> Z_1 + E_2
26.    -> output
27.    NORM(E_2) -> Z_2

DPMUL     Double Precision MULT

1.  A_1 * B_1
2.  -> D_1 + D_2
3.  A_1 * B_2
4.  -> E_1G
5.  A_2 * B_1
6.  -> F_1G
7.  F_1G + E_1G -> G_1G
8.  G_1G + D_2 -> H_1G
9.  H_1G + D_1 -> Z_1 + C_2
10. H_1G + D_1 -> Z_1 + C_2
11. NORM(C_2) -> Z_2

DPREC     Double Precision Reciprocation

1-6. RECIP(A_1) -> B_1                  (No Rounding)
7.   A_2 * B_1
8.   -> C_1G
9.   2 - A_1 * B_1
10.  -> D_1 + D_2
11.  -C_1G + D_2 -> E_1G
12.  E_1G * B_1
13.  -> F_1G
14.  B_1 + F_1G -> Z_1 + B_2
15.  B_1 + F_1G -> Z_1 + B_2
16.  NORM(B_2) -> Z_2

SQRT (A)                                Single Precision

1-10.  SQRTR(A) -> C
11-12. C MULT A -> Z

EADD/ESUB     Extended Add/Sub

1. A ± B -> P_1 + R_1
2. NORM(P_1) -> P_3
3. P_3 + R_1 -> Z_1 + P_4
4. P_3 + R_1 -> Z_1 + P_4
5. NORM(P_4) -> Z_2

EMUL          Extended Multiply

1. A * LSH(B) -> PP
2. A * MSH(B) + PP -> Z_1 + P_2
3. Z_1 -> Z_1
4. NORM(P_2) -> Z_2

DPADD/DPSUB   Double Precision ADD/SUB

1. A_1 ± B_1 -> D_1 + D_2
2. ± B_2 + D_2 -> E_1G
3. E_1G + A_2 -> F_1G
4. F_1G + D_1 -> G_1 + G_2
5. NORM(G_1) -> P_1
6. P_1 + G_2 -> C_1 + C_2
7. NORM(C_1) -> Z_1
8. NORM(C_2) -> Z_2

APPENDIX B

ERROR ESTIMATES FOR ARITHMETIC OPERATIONS

                                                        Floating Point
Operation                            Single Precision                Double Precision

Addition/Subtraction    A ± B        (A ± B) (1 + E_1)               (A ± B) (1 + 4E_2)
Multiplication          A * B        (A * B) (1 + E_1)               (A * B) (1 + 4E_2)
Reciprocation           1/A          1/A (1 + E_1)                   1/A (1 + 4E_2)
Division                B/A          B/A (1 + E_1) (1 + E_1)
Reciprocal Square Root  1/sqrt(A)    1/sqrt(A) (1 + E_1)             1/sqrt(A) (1 + 4E_2)
Square Root             A/sqrt(A)    A/sqrt(A) (1 + E_1) (1 + E_1)

where:

    $|E_1| \le 2^{-36}$ for single precision instructions,

    $|E_2| \le 2^{-72}$ for double precision instructions.